Using LSA and Noun Coordination Information to Improve the Precision and Recall of Automatic Hyponymy Extraction

Authors

Scott Cederberg

Dominic Widdows

Abstract

In this paper we demonstrate methods of improving both the recall and the precision of automatic methods for extraction of hyponymy (IS A) relations from free text. By applying latent semantic analysis (LSA) to filter extracted hyponymy relations we reduce the rate of error of our initial pattern-based hyponymy extraction by 30%, achieving precision of 58%. Applying a graph-based model of noun-noun similarity learned automatically from coordination patterns to previously extracted correct hyponymy relations, we achieve roughly a fivefold increase in the number of correct hyponymy relations extracted.
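The two steps named in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the two-dimensional vectors stand in for LSA vectors (which would normally come from a truncated SVD of a term-document matrix), the similarity threshold of 0.7 is arbitrary, and the function names `filter_pairs`, `build_coordination_graph` and `expand_pairs` are hypothetical. The expansion rule (a noun coordinated with a known hyponym is proposed as a hyponym of the same hypernym) is a simplified reading of the graph-based step.

```python
from collections import defaultdict
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def filter_pairs(pairs, vectors, threshold):
    """Keep (hyponym, hypernym) pairs whose vectors are similar enough."""
    kept = []
    for hypo, hyper in pairs:
        if hypo in vectors and hyper in vectors:
            if cosine(vectors[hypo], vectors[hyper]) >= threshold:
                kept.append((hypo, hyper))
    return kept


def build_coordination_graph(coordinations):
    """Undirected graph linking nouns seen together in coordinations."""
    graph = defaultdict(set)
    for a, b in coordinations:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def expand_pairs(correct_pairs, graph):
    """Propose new pairs: neighbours of a known hyponym inherit its hypernym."""
    proposed = set()
    for hypo, hyper in correct_pairs:
        for neighbour in graph.get(hypo, ()):
            if neighbour != hyper and (neighbour, hyper) not in correct_pairs:
                proposed.add((neighbour, hyper))
    return proposed


# Toy data (assumed for illustration only).
vectors = {"apple": [0.9, 0.1], "fruit": [0.8, 0.2], "bank": [0.1, 0.9]}
candidates = [("apple", "fruit"), ("bank", "fruit")]
coordinations = [("apple", "pear"), ("apple", "orange"), ("bank", "building")]

kept = filter_pairs(candidates, vectors, threshold=0.7)
graph = build_coordination_graph(coordinations)
proposed = expand_pairs(kept, graph)
```

On this toy data the filter discards the pair ("bank", "fruit"), and the expansion proposes "pear" and "orange" as further hyponyms of "fruit". Filtering before expanding matters in such a scheme, because any incorrect pair that survives would be propagated to all of its coordination neighbours.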

Similar resources

Using LSA and Noun Coordination Information to Improve the Recall and Precision of Automatic Hyponymy Extraction

Japanese Hyponymy Extraction based on a Term Similarity Graph

Semantic relations between words, such as hyponymy, synonymy and meronymy, have various information access applications (e.g. Web search) and the automatic extraction of such relations from corpora is an important research problem in natural language processing. For the Japanese language, there exist several linguistic resources that contain these relations, such as the Japanese Wordnet, Nihong...

Noun-Phrase Analysis in Unrestricted Text for Information Retrieval

Information retrieval is an important application area of natural-language processing where one encounters the genuine challenge of processing large quantities of unrestricted natural-language text. This paper reports on the application of a few simple, yet robust and efficient noun-phrase analysis techniques to create better indexing phrases for information retrieval. In particular, we describe...

Improving Automatic Summarization of Persian Texts Using Natural Language Processing Methods and a Similarity Graph

A significant amount of available information is stored in textual databases, which contain large collections of documents from different sources (such as news, articles, books, emails and web pages). The increasing visibility and importance of this class of information motivates us to work on having better automatic evaluation tools for textual resources. The automatic summarization of tex...

A New Method for Improving Computational Cost of Open Information Extraction Systems Using Log-Linear Model

Information extraction (IE) is the process of automatically producing a structured representation from unstructured or semi-structured text. It is a long-standing challenge in natural language processing (NLP), intensified by the growing volume and heterogeneity of information and by its unstructured form. One of the core information extraction tasks is relation extraction wh...

Publication date: 2003
